import galah
import pandas as pd
import geopandas
import numpy as np
from dateutil.parser import parse
import matplotlib.pyplot as plt
import matplotlib as mpl
import alphashape
from flexitext import flexitextHumans’ movement across the globe has led to the accidental, and sometimes deliberate, transportation of species beyond their native habitats. In Australia since European colonisation, around 3,000 species have been introduced.
Within the last 200 years over 100 native species have gone extinct, with invasive species labelled as affecting 82% (1,257 of 1,533) of Australia’s threatened taxa in 2018. Since 1960, invasive species have cost the Australian economy at least $390 billion in damages, and are now considered a main driver of extinctions in native plants and animals.
However, species from outside of Australia aren’t the only ones that can encroach on other species’ habitats. Native Australian species can do it, too. Thanks in part to human activity, changing temperatures and more frequent extreme weather events, some Australian species have established themselves in new areas outside of their native range. Although not as popularly discussed, Australian species that have become pests in new habitats can disrupt ecosystems much like internationally invasive species.
In this post, we will use Python and the {galah} package to visualise how distributions of both international invasive species and native introduced pest species have shifted over time. To do this, we will use alpha shapes to visualise the distribution of Rhinella marina (Cane toads) since the 1930s and create a choropleth map to visualise the expanded habitat range of Pittosporum undulatum.
Invasive Species
Download data
To start, we will use the infamous example of the cane toad to illustrate how far an invasive species’ distribution can spread each decade.
First load the required Python packages.
Next, we will use the {galah} package to download occurrence records of cane toads in Australia from the Atlas of Living Australia (ALA). You will need to first provide a registered email with the ALA using galah.galah_config() before retrieving records.
# Add registered email (register at ala.org.au)
galah.galah_config(email = "your-email@email.com")
galah.galah_config(data_profile="ALA")cane_toads = galah.atlas_occurrences(taxa = "Rhinella marina", use_data_profile = True)
cane_toads.head(5)| decimalLatitude | decimalLongitude | eventDate | scientificName | taxonConceptID | recordID | dataResourceName | occurrenceStatus | |
|---|---|---|---|---|---|---|---|---|
| 0 | -38.300000 | 145.000000 | 1990-04-23T00:00:00Z | Rhinella marina | https://biodiversity.org.au/afd/taxa/e79179f8-... | 58772bea-1c61-4716-a453-f201e89fcc8d | Museums Victoria provider for OZCAM | PRESENT |
| 1 | -38.300000 | 145.000000 | 1990-07-21T00:00:00Z | Rhinella marina | https://biodiversity.org.au/afd/taxa/e79179f8-... | 141f0b10-bdf2-4239-80e4-1b5167e1f76c | Museums Victoria provider for OZCAM | PRESENT |
| 2 | -37.820000 | 145.230000 | 1990-04-14T00:00:00Z | Rhinella marina | https://biodiversity.org.au/afd/taxa/e79179f8-... | f6d07eb2-4f8f-476e-a212-726b10a4a745 | Museums Victoria provider for OZCAM | PRESENT |
| 3 | -37.800000 | 144.700000 | 2022-04-15T21:52:00Z | Rhinella marina | https://biodiversity.org.au/afd/taxa/e79179f8-... | 1e683cd7-88fd-44b4-98cf-7cffff12ebee | Earth Guardians Weekly Feed | PRESENT |
| 4 | -36.431411 | 148.329322 | 2017-03-07T00:00:00Z | Rhinella marina | https://biodiversity.org.au/afd/taxa/e79179f8-... | 549908bf-0b34-4227-9288-42a7fa52dabf | NSW BioNet Atlas | PRESENT |
Clean data
We’ll clean our data to ensure that there are no null or missing values in our coordinates and date fields. Because galah.atlas_occurrences() returns a Pandas dataframe, we have plenty of functions we can use to clean our data.
cane_toads = cane_toads.dropna(subset=["eventDate", "decimalLatitude", "decimalLongitude"])We want to map cane toad’s distribution each decade in our final visualisation. However, the eventDate value for each record is formaatted as a string value yyyy-mm-dd Thh:mm:ssZ. Let’s write our own function convert_date_to_decade() that extract the year from a date string and return its corresponding decade by rounding down to the nearest decade.
def convert_date_to_decade(value):
date = parse(value)
return date.year - (date.year%10)We’ll create our new decade column by mapping each record’s date value in eventDate to its corresponding decade value.
cane_toads["decade"] = cane_toads["eventDate"].map(convert_date_to_decade)Make Australia map
Next, let’s download a shapefile of Australia with state boundaries. The Australian Bureau of Statistics provides digital boundary files from which you can explore many other Australian shapefiles. Download the States and Territories - 2021 - Shapefile a zip folder. Save the zip folder inside your working folder and then unzip it to access the .shp file inside.
{GeoPandas} is a package that handles geospatial data in Python and can be used to load in shapefiles as GeoPandas dataframes. Let’s test this out by plotting our Australian state boundary shapefile.
mpl.rcParams['figure.dpi'] = 1200 # generate a high resolution image
states = geopandas.read_file("Australia_state_boundaries/STE_2021_AUST_GDA2020.shp")
states.plot(edgecolor = "#5A5A5A", linewidth = 0.5, facecolor = "white")Generate alpha shapes
Alpha shapes can be used to define and visualise the shape of a set of species occurrence points in space. They are useful because they can be generated on data-deficient species with few available observations, and without using environmental data or complex algorithms. Let’s use alpha shapes to see how cane toads’ distribution has changed each decade since they were introduced.
First, we need to obtain a list of all decades with cane toad observations. We’ll use the decade column from our cane_toads dataframe to group our observations.
decades = list(set(cane_toads["decade"]))We will be using the {alphashape} package to create alpha shapes representing the cane toad distribution for each decade they have been observed. The alphashape.alphashape() function requires two things:
- A set of observation coordinates
- An alpha parameter, which sets how tightly the shape’s lines conform to our observations
Let’s make an alpha shape for each decade’s observations. We’ll also add a slight buffer to each alpha shape to smooth out some of its edges. Then we’ll group all the shapes into one large GeoPandas dataframe.
We used alpha = 1, but it’s good practice to change this parameter depending on how widely distributed the coordinates of your data are. Also note that alphashape.alphashape() requires at least 3 data points to calculate an alpha shape.
alpha_shape_gdf = geopandas.GeoDataFrame() # GeoPandas data frame to contain all alpha shapes
for i, d in enumerate(decades):
decade_points = cane_toads[["decimalLongitude", "decimalLatitude"]] [cane_toads["decade"] == d]
if len(decade_points) <= 3:
continue
alpha_shape = alphashape.alphashape(decade_points, 1)
d = {"decade": d, "geometry": [alpha_shape.buffer(0.2)]}
tmp_gdf = geopandas.GeoDataFrame(d, crs="EPSG:7844")
alpha_shape_gdf = pd.concat([alpha_shape_gdf, tmp_gdf])Next, let’s clean up our GeoPandas dataframe so that it is ready for plotting! Sometimes the alphashape.alphashape() algorithm will produce an empty shape that needs to be removed from the dataframe (this generally happens when the chosen alpha parameter is not appropriate for the supplied set of points). Let’s remove these shapes from our data.
alpha_shape_gdf = alpha_shape_gdf[~alpha_shape_gdf["geometry"].is_empty]Now let’s format our decade string to display correctly on the figure legend by making sure it’s in YYYYs format.
alpha_shape_gdf["decade_string"] = alpha_shape_gdf["decade"].map(lambda d: str(d) + "s")Finally, because we expect cane toad distributions in earlier decades to be smaller than in recent decades, we’ll need to plot earlier distributions on top of later distributions to avoid covering the earlier ones up. To achieve this, let’s order the alpha shapes in descending order by decade.
alpha_shape_gdf.sort_values(by='decade', ascending=False, inplace=True)Map alpha shape distributions
Finally, we can plot our alpha shape distributions for each decade onto our map of Australia!
This figure showcases the incredible pace of the cane toad’s spread across northern Australia. Our map shows that cane toads have spread across most of Queensland, the top end of the Northern Territory (from the 1980s to 2010s) and more recently, into the Kimberley region of Western Australia.
ax = states.boundary.plot(edgecolor="#5A5A5A", linewidth=0.5, facecolor="white", zorder=-1)
alpha_shape_gdf.plot(ax = ax, cmap="plasma", column = "decade_string", legend=True, categorical=True)
lgd = ax.get_legend()
lgd.draw_frame(False)
lgd.set_bbox_to_anchor((1.2, 0.8))
title_text = "<style: italic>Rhinella marina</> (cane toad) distributions per decade"
flexitext(0.5, 1, title_text, va="bottom", ha="center");
caption_text = "<color:#5A5A5A, style:italic, size:7>Distributions calculated with alpha hulls of each decade's cane toad observations</>"
flexitext(0.05, 0, caption_text, va="top");
plt.xlim([110, 161])
plt.ylim([-45, -8])
plt.axis("off")
plt.subplots_adjust(left=-0.15, right=1)
plt.show()Other invasive species
Let’s use the same code as above to visualise other invasive species Camelus dromedarius (Feral dromedary camels) and Echium plantagineum (Paterson’s curse).
# Camel
camels = galah.atlas_occurrences("Camelus dromedarius", use_data_profile="ALA")
camels = camels.dropna(subset=["eventDate", "decimalLatitude", "decimalLongitude"])
camels["decade"] = camels["eventDate"].map(convert_date_to_decade)
decades = list(set(camels["decade"]))
alpha_shape_gdf = geopandas.GeoDataFrame() # GeoPandas data frame to contain all alpha shapes
for i, d in enumerate(decades):
decade_points = camels[["decimalLongitude", "decimalLatitude"]] [camels["decade"] == d]
if len(decade_points) <= 3:
continue
alpha_shape = alphashape.alphashape(decade_points, 1)
d = {"decade": d, "geometry": [alpha_shape.buffer(0.2)]}
tmp_gdf = geopandas.GeoDataFrame(d, crs="EPSG:4326")
alpha_shape_gdf = pd.concat([alpha_shape_gdf, tmp_gdf])
alpha_shape_gdf = alpha_shape_gdf[ ~alpha_shape_gdf["geometry"].is_empty]
alpha_shape_gdf["decade_string"] = alpha_shape_gdf["decade"].map(lambda d: str(d) + "s")
alpha_shape_gdf.sort_values(by='decade', ascending=False, inplace=True)
ax = states.boundary.plot(edgecolor="#5A5A5A", linewidth=0.5, facecolor="white", zorder=-1)
alpha_shape_gdf.plot(ax = ax, cmap="plasma", column = "decade", legend=True, categorical=True)
lgd = ax.get_legend()
lgd.draw_frame(False)
lgd.set_bbox_to_anchor((1.2, 0.61))
title_text = "<style: italic>Camelus dromedarius</> (dromedary camel) distributions per decade"
flexitext(0.5, 1, title_text, va="bottom", ha="center");
caption_text = "<color:#5A5A5A, style:italic, size:7>Distributions calculated with alpha hulls of each decade's dromedary camel observations</>"
flexitext(0.05, 0, caption_text, va="top");
plt.xlim([110, 161])
plt.ylim([-45, -8])
plt.axis("off")
plt.subplots_adjust(left=-0.1, right=1)
plt.show()
# Paterson's Curse
opuntia = galah.atlas_occurrences("Echium plantagineum", use_data_profile="ALA")
opuntia = opuntia.dropna(subset=["eventDate", "decimalLatitude", "decimalLongitude"])
opuntia["decade"] = opuntia["eventDate"].map(convert_date_to_decade)
decades = list(set(opuntia["decade"]))
alpha_shape_gdf = geopandas.GeoDataFrame() # GeoPandas data frame to contain all alpha shapes
for i, d in enumerate(decades):
decade_points = opuntia[["decimalLongitude", "decimalLatitude"]] [opuntia["decade"] == d]
if len(decade_points) <= 3:
continue
alpha_shape = alphashape.alphashape(decade_points, 1)
d = {"decade": d, "geometry": [alpha_shape.buffer(0.2)]}
tmp_gdf = geopandas.GeoDataFrame(d, crs="EPSG:4326")
alpha_shape_gdf = pd.concat([alpha_shape_gdf, tmp_gdf])
alpha_shape_gdf = alpha_shape_gdf[ ~alpha_shape_gdf["geometry"].is_empty]
alpha_shape_gdf["decade_string"] = alpha_shape_gdf["decade"].map(lambda d: str(d) + "s")
alpha_shape_gdf.sort_values(by='decade', ascending=False, inplace=True)
ax = states.boundary.plot(edgecolor="#5A5A5A", linewidth=0.5, facecolor="white", zorder=-1)
alpha_shape_gdf.plot(ax = ax, cmap="plasma", column = "decade", legend=True, categorical=True)
lgd = ax.get_legend()
lgd.draw_frame(False)
lgd.set_bbox_to_anchor((1.2, 0.85))
title_text = "<style: italic>Echium plantagineum</> (Paterson's curse) distributions per decade"
flexitext(0.5, 1, title_text, va="bottom", ha="center");
caption_text = "<color:#5A5A5A, style:italic, size:7>Distributions calculated with alpha hulls of each decade's Paterson's curse observations</>"
flexitext(0.05, 0, caption_text, va="top");
plt.xlim([110, 161])
plt.ylim([-45, -8])
plt.axis("off")
plt.subplots_adjust(left=-0.1, right=1)
plt.show()Native introduced pest species
When people think of invasive species, they generally think of species that have been introduced to Australia from other countries. However, even Australia’s native species can become pests when introduced to a new ecosystem.
One good example of native pests are the trees Pittosporum undulatum (sometimes called Sweet Pittosporum). These trees have been introduced as ornamental plants in gardens across Australia because of their sweet-scented flowers and bright berries. Although Pittosporum undulatum’s native range extends from southern Queensland to eastern Victoria, it is now considered an environmental weed in many regions where it has been introduced.